<html>
<head>
<title>firstworks   Tuning SQL Relay</title>
<link href="css/styles.css" rel="stylesheet">
</head>
<body>

<span class="heading1">Tuning SQL Relay</span><br><br>

<ul>
<li><a href="#localdbconnection">Connect To The Database Using a Local Connection</a></li>
<li><a href="#unixsocket">Connect To SQL Relay Using a Unix Socket</a></li>
<li><a href="#handoff">Use File Descriptor Passing</a></li>
<li><a href="#systemparameters">Tweak Various System Parameters</a></li>
<li><a href="#dontgetcolumninfo">Don't Get Unnecessary Column Info</a></li>
<li><a href="#resultsetbuffersize">Use Result Set Buffer Size</a></li>
<li><a href="#memoryusage">Memory Usage</a></li>
<li><a href="#howmanyconnections">How Many Connections Should I Run?</a></li>
</ul>

<br>
<a name="localdbconnection"></a>
<span class="heading2">Connect To The Database Using a Local Connection</span><br>

<p>One of the most common mistakes that people make when using SQL Relay is
running SQL Relay on the same machine as the database, but not using a "local
connection" to the database.  It's an easy mistake to make, but when remedied,
it can result in a substantial improvement in performance.</p>

<p>Most databases support "remote connections" (usually over an inet socket) and
"local connections" (usually over a unix socket).  If you're running SQL Relay
on a seperate machine from the database, you have to configure it to connect to
the database using a remote connection.  However, if you're running SQL Relay
on the same machine as the database, then you should configure it to connect to
the database using a local connection if possible.</p>

<span class="heading3">MySQL</span>

<p>MySQL supports local connections over a unix socket.</p>

<p>By default, MySQL is usually configured to accept local connections.  To make
sure MySQL is configuerd to accept local connections, edit the MySQL
configuration file (usually /etc/my.cnf) and add a line like the following to
the [mysqld] section if it's not already there:</p>

<blockquote>
<font color="#298a52"><b>socket=</b></font>/var/lib/mysql/mysql.sock
</blockquote>

<p>Now restart MySQL.</p>

<p>To configure SQL Relay to connect to MySQL using the local connection,
edit the sqlrelay.conf file and make sure that the connectstring for the
MySQL instance uses the socket parameter, as follows:</p>

<blockquote>
<font color="#ff00ff">user=testuser;password=testpassword;db=testdb;socket=/var/lib/mysql/mysql.sock</font>
</blockquote>

<p>as opposed to:</p>

<blockquote>
<font color="#ff00ff">user=testuser;password=testpassword;db=testdb;host=localhost;port=5432</font>
</blockquote>

<span class="heading3">PostgreSQL</span>

<p>PostgreSQL supports local connections over a unix socket.</p>

<p>By default, PostgreSQL is configured to accept local connections.  I'm not
sure you can even turn it off.  By default, PostgreSQL listens on
/tmp/.s.PGSQL.5432.  The directory can be overridden by modifying the
PostgreSQL configuration file (usually /var/lib/pgsql/data/postgresql.conf).
The following parameters affect which file PostgreSQL listens on and that
file's properties.  By default these parameters are commented out.  To modify
the parameters, remove the leading # and set the parameter.  By default, the
parameters look like this:</p>

<blockquote>
<pre>
<font color="#0000ff">#unix_socket_directory = ''</font>
<font color="#0000ff">#unix_socket_group = ''</font>
<font color="#0000ff">#unix_socket_permissions = 0777 # octal</font>

<font color="#0000ff">#port = 5432</font>
</pre>
</blockquote>

<p>The socket file that PostgreSQL will listen on will be found in /tmp by
default, or in whatever directory is specified by unix_socket_directory.  The
file always start with .s.PGSQL. and end with whatever port is specified with
the port parameter.  Note that the port parameter also determines which inet
port PostgreSQL will listen on.  By default it will have 777 permissions and
be owned by whatever user and group the postmaster process is running as.  You
can override the permissions using the unix_socket_permissions parameter and
the group using the unix_socket_group parameter.</p>

<p>The postmaster process also takes several command line parameters.  The -k
parameter overrides the unix_socket_directory parameter and the -p parameter 
overrides the port parameter.</p>

<p>After you modify postgresql.conf, restart PostgreSQL.</p>

<p>To configure SQL Relay to connect to PostgreSQL using the local connection,
edit the sqlrelay.conf file and make sure that the connectstring for the
PostgreSQL instance uses the host and port parameters as follows:</p>

<blockquote>
<font color="#ff00ff">user=testuser;password=testpassword;host=/tmp;port=5432;</font>
</blockquote>

<p>as opposed to:</p>

<blockquote>
<font color="#ff00ff">user=testuser;password=testpassword;host=localhost;port=5432;</font>
</blockquote>

<p>Ordinarily, the host parameter tells SQL Relay what host to connect to, but
if it starts with a / then it indicates what directory to look for the socket
file in.  Likewise, the port parameter ordinarily tells SQL Relay what port to
connect to, but if host starts with a / then it indicates that SQL Relay should
look for a file in the directory specifid by the host parameter starting with
.s.PGSQL. and ending in the port number.</p>

<span class="heading3">Oracle</span>

<p>Oracle supports local connections over IPC using shared memory.</p>

<p>By default, Oracle isn't usually set up to listen for IPC connections.
To set up Oracle to listen for IPC connections, edit
$ORACLE_HOME/network/admin/listener.ora and look for a section like this:</p>

<blockquote>
<pre>
LISTENER =
  (<font color="#298a52"><b>DESCRIPTION_LIST</b></font> =
    (<font color="#298a52"><b>DESCRIPTION</b></font> =
      (<font color="#298a52"><b>ADDRESS_LIST</b></font> =
        (<font color="#298a52"><b>ADDRESS</b></font> = (<font color="#a52829"><b>PROTOCOL</b></font> = <font color="#a52829"><b>TCP</b></font>)(<font color="#a52829"><b>HOST</b></font> = localhost.localdomain)(<font color="#a52829"><b>PORT</b></font> = 1521))
      )
    )
  )
</pre>
</blockquote>

<p>Add an additional ADDRESS section as follows:</p>

<blockquote>
<pre>
LISTENER =
  (<font color="#298a52"><b>DESCRIPTION_LIST</b></font> =
    (<font color="#298a52"><b>DESCRIPTION</b></font> =
      (<font color="#298a52"><b>ADDRESS_LIST</b></font> =
        (<font color="#298a52"><b>ADDRESS</b></font> = (<font color="#a52829"><b>PROTOCOL</b></font> = <font color="#008a8c">IPC</font>)(<font color="#a52829"><b>KEY</b></font> = EXTPROC))
        (<font color="#298a52"><b>ADDRESS</b></font> = (<font color="#a52829"><b>PROTOCOL</b></font> = <font color="#a52829"><b>TCP</b></font>)(<font color="#a52829"><b>HOST</b></font> = localhost.localdomain)(<font color="#a52829"><b>PORT</b></font> = 1521))
      )
    )
  )
</pre>
</blockquote>

<p>Now Oracle's listener is configured to listen for both IPC and inet socket
connections.  Restart Oracle</p>

<p>SQL Relay looks up the ORACLE_SID in the
$ORACLE_HOME/network/admin/tnsnames.ora file and uses the configuration
parameters there to connect to the database.  By default, the connection to the
database is configured to use an inet socket.  To set up an ORACLE_SID to use
IPC instead, edit $ORACLE_HOME/network/admin/tnsnames.ora and look for a
section for the ORACLE_SID like this:</p>

<blockquote>
<pre>
ORA1 =
  (<font color="#298a52"><b>DESCRIPTION</b></font> =
    (<font color="#298a52"><b>ADDRESS</b></font> = (<font color="#a52829"><b>PROTOCOL</b></font> = <font color="#a52829"><b>TCP</b></font>)(<font color="#a52829"><b>HOST</b></font> = localhost.localdomain)(<font color="#a52829"><b>PORT</b></font> = 1521))
    (<font color="#298a52"><b>CONNECT_DATA</b></font> =
      (<font color="#a52829"><b>SERVER</b></font> = <font color="#008a8c">DEDICATED</font>)
      (<font color="#a52829"><b>SERVICE_NAME</b></font> = ora1)
    )
  )
</pre>
</blockquote>

<p>Modify it as follows:</p>

<blockquote>
<pre>
ORA1 =
  (<font color="#298a52"><b>DESCRIPTION</b></font> =
    (<font color="#298a52"><b>ADDRESS</b></font> = (<font color="#a52829"><b>PROTOCOL</b></font> = <font color="#008a8c">IPC</font>)(<font color="#a52829"><b>KEY</b></font> = EXTPROC))
    (<font color="#298a52"><b>CONNECT_DATA</b></font> =
      (<font color="#a52829"><b>SERVER</b></font> = <font color="#008a8c">DEDICATED</font>)
      (<font color="#a52829"><b>SERVICE_NAME</b></font> = ora1)
    )
  )
</pre>
</blockquote>

<p>SQL Relay requires no specific modifications to connect to the database using
IPC instead of an inet socket.  Just configure the connect string to use an SID
that uses IPC to connect to the database.  For example:</p>

<blockquote>
<font color="#ff00ff">user=testuser;password=testpassword;oracle_sid=ora1;</font>
</blockquote>

<span class="heading3">SQLite</span>

<p>SQLite is different from other databases.  SQL Relay's sqlite connection
uses the sqlite library to access the database file directly.  Effectively,
the database connection is always local.</p>

<span class="heading3">IBM DB2</span>

<p>As far as I can tell, DB2 supports local connections over IPC using
shared memory, semaphores and message queues (though I can't find any
documentation that specifically says that).</p>

<p>By default, clients connect to DB2 databases using local connections.  In
fact, to access a remote instance, you have to set up a stub local instance
which relays to the remote instance and connect to the local instance.</p>

<p>SQL Relay requires no specific modifications to connect to the database.
Just configure the connect string to use the local database.  For example:</p>

<blockquote>
<font color="#ff00ff">db=testdb</font>
</blockquote>

<span class="heading3">Sybase</span>

<p>As far as I can tell, Sybase only supports client connections over inet
sockets.  If anyone knows differently, send mail to
<a href="mailto:david.muse@firstworks.com">david.muse@firstworks.com</a>.</p>

<span class="heading3">Firebird</span>

<p>As far as I can tell, Firebird only supports client connections
over inet sockets.  If anyone knows differently, send mail to
<a href="mailto:david.muse@firstworks.com">david.muse@firstworks.com</a>.</p>

<br>
<a name="unixsocket"></a>
<span class="heading2">Connect To SQL Relay Using a Unix Socket</span><br>

<p>SQL Relay supports inet and unix sockets.  If you are runing the SQL Relay
client (your program) and the SQL Relay server on the same machine, make sure
that they are talking to each other over the unix socket.</p>

<p>First make sure that SQL Relay is configured to listen on a unix
socket.  In the sqlrelay.conf file, the instance that is talking to mysql
should have a socket attribute:</p>

<blockquote>
<instance id="mysqltest" port="8006" socket="/tmp/mysqltest.socket" ... >
</blockquote>

<p>In the above example, the SQL Relay server will listen on inet port 8006 and
unix socket /tmp/mysqltest.socket.</p>

<p>Now make sure that your app is using that socket as well.  When you create
a new sqlrconnection, make sure you tell it to use the unix socket:</p>

<blockquote>
sqlrconnection	*con=new sqlrconnection("localhost",8006,"/tmp/mysqltest.socket","user","password",0,1);
</blockquote>

<p>as opposed to:</p>

<blockquote>
sqlrconnection  *con=<font color="#a52829"><b>new</b></font> sqlrconnection(<font color="#ff00ff">&quot;localhost&quot;</font>,<font color="#ff00ff">8006</font>,<font color="#ff00ff">NULL</font>,<font color="#ff00ff">&quot;user&quot;</font>,<font color="#ff00ff">&quot;password&quot;</font>,<font color="#ff00ff">0</font>,<font color="#ff00ff">1</font>);
</blockquote>

<p>(for non-C++ api's the syntax is similar)</p>

<p>The client will first try the unix socket, then if it can't connect, it will
try the host/port.</p>

<p>Make sure to type the filename of the unix socket correctly :)  I've seen
several cases where people were really confused why they were getting bad
performance even though they were using unix sockets.  They had just typed the
unix socket file name wrong and it was trying it, failing, then falling back to
host/port.</p>


<br>
<a name="handoff"></a>
<span class="heading2">Use File Descriptor Passing</span><br>

<p>SQL Relay has a listener process that waits for incoming clients, then hands
them off to a seperate process that is already logged into the database when one
of those processes is available.  The handoff can be done using file descriptor
passing or by causing the client to disconnect from the listener and reconnect
to the other process.  File descriptor passing is much faster and uses fewer
system resources.</p>

<p>In the sqlrelay.conf file, for the instance that is talking to mysql, make
sure the handoff attribute is set to "pass":

<blockquote>
<font color="#008a8c">&lt;</font><font color="#008a8c">instance</font><font color="#008a8c"> </font><font color="#298a52"><b>id</b></font>=<font color="#ff00ff">&quot;mysqltest&quot;</font><font color="#008a8c"> </font><font color="#298a52"><b>port</b></font>=<font color="#ff00ff">&quot;8006&quot;</font><font color="#008a8c"> ... </font><font color="#298a52"><b>handoff</b></font>=<font color="#ff00ff">&quot;pass&quot;</font><font color="#008a8c"> ... &gt;</font>
</blockquote>

<p>as opposed to:</p>

<blockquote>
<font color="#008a8c">&lt;</font><font color="#008a8c">instance</font><font color="#008a8c"> </font><font color="#298a52"><b>id</b></font>=<font color="#ff00ff">&quot;mysqltest&quot;</font><font color="#008a8c"> </font><font color="#298a52"><b>port</b></font>=<font color="#ff00ff">&quot;8006&quot;</font><font color="#008a8c"> ... </font><font color="#298a52"><b>handoff</b></font>=<font color="#ff00ff">&quot;reconnect&quot;</font><font color="#008a8c"> ... &gt;</font>
</blockquote>


<br>
<a name="systemparameters"></a>
<span class="heading2">Tweak Various System Parameters</span><br>

<p>Certain system parameters can be tuned to get better performance out of
SQL Relay.</p>

<span class="heading3">TIME_WAIT timeout</span>

<p>The first paramater that comes to mind is the TIME_WAIT timeout.  When a TCP 
client disconnects from a server, the socket that the client was connected on
goes into a TIME_WAIT state for typically between a minute and 4 minutes.</p>

<p>For servers serving data over the the unreliable internet, this is probably 
reasonable.  For internal servers, dedicated to serving other internal servers 
on a reliable network, reducing the length of the timeout is probably OK.</p>

<p>Here's why it helps...</p>

<p>The kernel keeps a list of sockets in the TIME_WAIT state.  When the list
is full, failures start to occur.</p>

<p>If your server is getting new client connections faster than it can bleed off
sockets in the TIME_WAIT state, the list will ultimately get full.  Decreasing
the timeout increases the bleed-off rate.</p>

<p>The following instructions illustrate how to change the timeout rate for
Linux and Solaris.  Note that I got these instructions off of the web and have
not tried all of them myself.</p>

<p>For Linux, set the timeout by executing the following command.  In this
example, the timeout is set to 30 seconds.  You should put this command in a
system startup file so it will be executed at boot time.</p>

<p>echo 30 &gt; /proc/sys/net/ipv4/tcp_fin_timeout</p>

<p>You might also try enabling tcp_tw_recycle and tcp_tw_reuse as follows:</p>

<p>echo 1 &gt; /proc/sys/net/ipv4/tcp_tw_reuse<br>
echo 1 &gt; /proc/sys/net/ipv4/tcp_tw_recycle<br></p>

<p>Those parameters enable fast recycling and reuse of sockets in the
TIME_WAIT state.  I can't find any documentation about them other than the
following though:</p>

<blockquote><pre>
tcp_tw_recycle - BOOLEAN
 Enable fast recycling TIME-WAIT sockets. Default value is 0.
 It should not be changed without advice/request of technical
 experts.

tcp_tw_reuse - BOOLEAN
 Allow to reuse TIME-WAIT sockets for new connections when it is
 safe from protocol viewpoint. Default value is 0.
 It should not be changed without advice/request of technical
 experts.
</pre></blockquote>

<p>For Solaris, the parameter can be modified while the system is running
using the ndd command to set the number of milliseconds to wait.  These
examples set the timeout to 30 seconds.  You should put these commands in a 
system startup file so they'll be executed at boot time.</p>

<p>For Solaris 2.6 and earlier:  ndd -set /dev/tcp tcp_close_wait_interval 30000</p>

<p>For Solaris 2.7 and later:  ndd -set /dev/tcp tcp_time_wait_interval 30000</p>

<span class="heading3">Port range</span>

<p>Another paramter that you may want to tweak is the range of available ports.
On Linux 2.2 kernels, it defaults to ports 1024 through 4999.  You can display
the range by running:</p>

<p>/sbin/sysctl net.ipv4.ip_local_port_range</p>

<p>You can increase this to range from 1024 to 65535 by running the following 
command:</p>

<p>/sbin/sysctl -w net.ipv4.ip_local_port_range="1024 65535"</p>

<p>You should put this command in a system startup file so it'll be executed at
boot time.</p>

<p>I'm not sure what the default port range is or how to change it on other
operating systems.</p>


<br>
<a name="dontgetcolumninfo"></a>
<span class="heading2">Don't Get Unnecessary Column Info</span><br>

<p>If you run select queries but don't need to know the names/types/sizes of
the columns, rather you can just refer to them by number, then you can call
dontGetColumnInfo() before running your query and much less data will be
transmitted from the SQL Relay server to the client.  This is really useful for
speeding up 1 column queries like "select count(*) from mytable".</p>


<br>
<a name="resultsetbuffersize"></a>
<span class="heading2">Use Result Set Buffer Size</span><br>

<p>By default, SQL Relay fetches the entire result set of a query into memory.
SQL Relay was originally written for use with web-based apps, and most web
pages display small result sets so generally it's faster if the entire result
set is buffered.  But, if you run select queries that return really long result
sets, you should use setResultSetBufferSize() to fetch rows in groups instead
of all at once or 1 at a time.  The value to use tends to be query-specific
though so you usually have to do some benchmarking to get it optimized.  If you
fetch too many rows at once, memory has to be allocated which slows things down
(and in really bad cases can cause swapping) but if you fetch too few at once,
it causes too many round trips to the database.  Each query has a sweet spot,
finding it may dramatically improve that query's performance.</p>

<br>
<a name="memoryusage"></a>
<span class="heading2">Memory Usage</span><br>

<p>Some databases only support single-row fetches; you can run a query,
then fetch 1 row at a time from the database.  Oracle, DB2 and Sybase support
array fetches; you can run a query and then fetch any number of rows from
the database in a single round-trip.  This improves performance, but at a cost.
For each cursor, a buffer which is large enough to accommodate the maximum size
of the data that might be fetched, must be allocated and used to fetch the
rows.</p>

<p>For DB2 and Sybase, this is a 10mb buffer.  For Oracle, it's a 40mb
buffer.</p>

<p>If that seems like more memory than you'd like to allocate per-cursor,
then you can tune it down.  Unfortunately, for now, except for Oracle, the
tunable parameters are compile-time parameters.</p>

<p>For Oracle, the fetchatonce, maxselectlistsize and maxitembuffersize
parameters of the string attribute of the connection tag in the sqlrelay.conf
file may be set to control these values at run time.</p>

<p>For non-Oracle databases, in the following files...</p>

<ul>
<li>src/connections/db2/db2connection.h</li>
<li>src/connections/sybase/sybaseconnection.h</li>
</ul>

<p>There are lines like this:</p>

<blockquote>
<pre>
<font color="#a020f0">#define FETCH_AT_ONCE           </font><font color="#ff00ff">10</font>
<font color="#a020f0">#define MAX_SELECT_LIST_SIZE    </font><font color="#ff00ff">256</font>
<font color="#a020f0">#define MAX_ITEM_BUFFER_SIZE    </font><font color="#ff00ff">4096</font>
</pre>
</blockquote>

<p>FETCH_AT_ONCE controls the number of rows that are fetched from the database
at once.  MAX_SELECT_LIST_SIZE controls the maximum number of columns that may
be fetched during a select.  MAX_ITEM_BUFFER_SIZE is the maximum size of an
individual field that may be returned by the database during a select.</p>

<p>The default values for FETCH_AT_ONCE and MAX_SELECT_LIST_SIZE are 10 and
256 respectively, for all databases.  The value of MAX_ITEM_BUFFER_SIZE
depends on the maximum number of characters that a varchar or varchar2 column
type can store and varies between databases.</p>

<p>The number of bytes that the database connection daemon allocates for
fetching multiple rows of result sets is equal to
FETCH_AT_ONCE * MAX_SELECT_LIST_SIZE * MAX_ITEM_BUFFER_SIZE.  So, reducing any
of those sizes will reduce the amount of memory used by each connection
daemon.</p>

<p>Use care when reducing MAX_SELECT_LIST_SIZE and MAX_ITEM_BUFFER_SIZE.  If you
reduce MAX_SELECT_LIST_SIZE to 20, but then run a query that fetched 25 columns,
the last 5 will be truncated.  Similarly, if you set MAX_ITEM_BUFFER_SIZE to
2048, but then fetch a varchar2 field with 3000 characters in it, the last 952
characters will be truncated.  However, if you know that none of the queries
your applications run fetch more than some number of rows, then it is safe to
reduce the MAX_SELECT_LIST_SIZE to that number.  Similarly, if you know that
none of the varchar or varchar2 columns that your applications access are
larger than a particular size, it's safe to reduce MAX_ITEM_BUFFER_SIZE to
that size.</p>

<p>Reducing FETCH_AT_ONCE will cause a degredation in performance, as more 
round-trips between SQL Relay and the database have to be made.  But,
performance does not degrade on 1-to-1 basis.  For example, reducing
FETCH_AT_ONCE from 10 to 5 will not your apps will not run half as fast.</p>

<p>After modifying these parameters, you must rebuild and re-install SQL Relay 
for the changes to have an effect.  If you have already run the <i>configure</i>
script, then you should do a <i>make clean</i>, followed by a <i>make</i> and
<i>make install</i> at the top level of the SQL Relay source distribution.
If you have not run the <i>configure</i> script, or have recently run a 
<i>make distclean</i> then just proceed as if you were building and installing
SQL Relay for the first time.  See
<a href="installing.html">Installing SQL Relay</a> for more information.</p>

<br>
<a name="howmanyconnections"></a>
<span class="heading2">How Many Connections Should I Run?</span><br>

<p>A good rule of thumb is to run as many connections as your database can
handle.</p>

<p>The best way is to determine how many connections to run is to make an
educated guess, let your applications run, monitor the performance of the
SQL Relay server and database and adjust the number of connections
accordingly.</p>

<p>If you are implementing SQL Relay because your database had been 
getting overloaded with too many connections, you may have a pretty good idea of
how many connections your database can handle, and you can use that number as a
starting point.</p>

<p>When just letting your applications run isn't feasable, you can use a
load-testing program like apachebench or LoadRunner to do automated testing.
You'll have to set up scripts and pages to simulate the kind of load that your
applications will place on SQL Relay and the database, but you can get an
accurate idea of how your application will perform end-to-end.  You can let the
load-testing program run, monitor the performance of the SQL Relay server and
database and adjust the number of connections accordingly.</p>

<p>There is also a program in the test/stress directory of the SQL Relay
distribution called <i>querytest</i> that logs into the SQL Relay server,
runs a series of queries and logs out, over and over, as fast as possible.
You can run many instances of <i>querytest</i> simultaneously to simulate
the kind of load your applications might place on SQL Relay and the database.
<i>querytest</i> is OK as a starting point, but to do any serious testing, you
should modify the queries that it runs to more accurately simulate your
applications.</p>

</body>
</html>
